
    Accelerating sparse restricted Boltzmann machine training using non-Gaussianity measures

    In recent years, sparse restricted Boltzmann machines have gained popularity as unsupervised feature extractors. Starting from the observation that their training process is biphasic, we investigate how it can be accelerated: by determining when it can be stopped based on the non-Gaussianity of the distribution of the model parameters, and by increasing the learning rate when the learnt filters have locked on to their preferred configurations. We evaluated our approach on the CIFAR-10, NORB and GTZAN datasets.
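    A minimal sketch of such a stopping criterion, assuming excess kurtosis as the non-Gaussianity measure (the abstract does not fix a specific measure) and an arbitrary plateau tolerance:

```python
import numpy as np
from scipy.stats import kurtosis

def non_gaussianity(weights: np.ndarray) -> float:
    """Excess kurtosis of the flattened weight distribution.

    A Gaussian distribution has excess kurtosis 0; once sparse filters
    lock on to their preferred configurations, the weight distribution
    typically becomes heavy-tailed (kurtosis well above 0).
    """
    return kurtosis(weights.ravel())

def should_stop(history, window=5, tol=1e-2):
    """Stop training once the measure has plateaued over `window` epochs."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return max(recent) - min(recent) < tol

# Inside the training loop one would record non_gaussianity(rbm_weights)
# after each epoch and stop (or raise the learning rate) once it levels off.
```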

    The body as a reservoir: locomotion and sensing with linear feedback

    It is known that mass-spring nets have computational power and can be trained to reproduce oscillating patterns. In this work, we extend this idea to locomotion and sensing. We simulate systems made out of bars and springs and show that stable gaits can be maintained by these structures with only linear feedback. We then conduct a classification experiment in which the system has to distinguish terrains while maintaining an oscillatory pattern. These experiments indicate that the control of compliant robots can be simplified if one exploits the computational power of the body’s dynamics.
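    A toy illustration of the idea, assuming a one-dimensional chain of unit masses coupled by springs, with spring elongations read out and fed back linearly as rest-length offsets. The feedback matrix W is random here purely for illustration; in the paper's setting it would be trained (e.g. by linear regression against a target gait):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 'body': n point masses in 1D coupled by springs to their neighbours.
n, k, c, dt = 10, 5.0, 0.1, 0.01     # masses, stiffness, damping, time step
x = rng.normal(0, 0.1, n)            # positions (perturbed from rest)
v = np.zeros(n)                      # velocities

# Linear feedback: spring elongations are read out and fed back as
# rest-length offsets, so the body's own state drives its actuation.
W = rng.normal(0, 0.5, (n - 1, n - 1))

for step in range(1000):
    elong = np.diff(x)               # current spring elongations
    rest = W @ elong                 # linear feedback -> rest-length offsets
    f_spring = k * (elong - rest)    # Hooke's law per spring
    f = np.zeros(n)
    f[:-1] += f_spring               # each spring pulls on both endpoints
    f[1:] -= f_spring
    v += dt * (f - c * v)            # unit masses, viscous damping
    x += dt * v
```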

    Learning content-based metrics for music similarity

    In this abstract, we propose a method to learn application-specific content-based metrics for music similarity using unsupervised feature learning and neighborhood components analysis. Multiple-timescale features extracted from music audio are embedded into a Euclidean metric space, so that the distance between songs reflects their similarity. We evaluated the method on the GTZAN and Magnatagatune datasets.
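    A sketch of the metric-learning step using scikit-learn's NeighborhoodComponentsAnalysis, with random stand-ins for the learned multiple-timescale audio features and for the labels that define similarity:

```python
import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-ins for learned multiple-timescale audio features and for
# similarity labels (e.g. genres or tags); in the paper these come
# from unsupervised feature learning on music audio.
X = np.random.randn(200, 64)          # 200 songs, 64-dim features
y = np.random.randint(0, 10, 200)     # 10 hypothetical genre labels

# NCA learns a linear map A such that Euclidean distance in the
# embedded space ||A x_i - A x_j|| reflects label similarity.
nca = make_pipeline(StandardScaler(),
                    NeighborhoodComponentsAnalysis(n_components=32,
                                                   random_state=0))
nca.fit(X, y)
embedded = nca.transform(X)           # distances here define the metric
```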

    Memory in reservoirs for high dimensional input

    Reservoir Computing (RC) is a recently introduced scheme to employ recurrent neural networks while circumventing the difficulties that typically appear when training the recurrent weights. The ‘reservoir’ is a fixed, randomly initialized recurrent network which receives input via a random mapping. Only an instantaneous linear mapping from the network to the output is trained, which can be done with linear regression. In this paper we study the dynamical properties of reservoirs receiving a high number of inputs. More specifically, we investigate how the internal state of the network retains fading memory of its input signal. Memory properties of random recurrent networks have been thoroughly examined in past research, but only for one-dimensional input. Here we take into account statistics which will typically occur in high-dimensional signals. We present empirical results that show how memory in recurrent networks is distributed over the individual principal components of the input.
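    A minimal experiment in this spirit (not the paper's exact setup): drive a random tanh reservoir with correlated multi-dimensional noise and measure, per principal component of the input, how well a linear readout can reconstruct that component k steps in the past:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, T = 200, 10, 5000                      # reservoir size, input dim, steps

W = rng.normal(0, 1, (N, N))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))    # rescale to spectral radius 0.9
V = rng.normal(0, 0.1, (N, d))               # random input mapping

# Correlated high-dimensional input: mixed white noise, so the signal
# has a non-trivial principal-component structure.
M = rng.normal(0, 1, (d, d))
u = rng.normal(0, 1, (T, d)) @ M.T

x = np.zeros(N)
X = np.zeros((T, N))
for t in range(T):
    x = np.tanh(W @ x + V @ u[t])
    X[t] = x

# Project the input onto its principal components, then measure how well
# a linear readout recovers each component at delay k (memory function).
u_c = u - u.mean(0)
_, _, Vt = np.linalg.svd(u_c, full_matrices=False)
pcs = u_c @ Vt.T

def memory(k, comp, washout=100):
    target = pcs[washout:T - k, comp]        # component k steps in the past
    states = X[washout + k:, :]
    w, *_ = np.linalg.lstsq(states, target, rcond=None)
    return np.corrcoef(states @ w, target)[0, 1] ** 2

print([round(memory(5, c), 3) for c in range(d)])
```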

    Recurrent kernel machines : computing with infinite echo state networks

    Echo state networks (ESNs) are large, random recurrent neural networks with a single trained linear readout layer. Despite the untrained nature of the recurrent weights, they are capable of performing universal computations on temporal input data, which makes them interesting for both theoretical research and practical applications. The key to their success lies in the fact that the network computes a broad set of nonlinear, spatiotemporal mappings of the input data, on which linear regression or classification can easily be performed. One could consider the reservoir as a spatiotemporal kernel, in which the mapping to a high-dimensional space is computed explicitly. In this letter, we build on this idea and extend the concept of ESNs to infinite-sized recurrent neural networks, which can be considered recursive kernels that subsequently can be used to create recursive support vector machines. We present the theoretical framework, provide several practical examples of recursive kernels, and apply them to typical temporal tasks.
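    As one plausible example of the general shape of such kernels (not necessarily one of the paper's exact constructions), here is a recursive RBF-style sequence kernel in which the previous kernel value stands in for the recurrent state of two infinite networks; the scaling parameters a and b are arbitrary:

```python
import numpy as np

def recursive_rbf(u, v, a=1.0, b=1.0):
    """Recursive RBF-style kernel between two input sequences u and v.

    k_0 = 1;  k_t = exp(-a * ||u_t - v_t||^2 - 2 * b * (1 - k_{t-1}))

    The k_{t-1} term carries the fading memory of past inputs, playing
    the role the recurrent state plays in a finite echo state network.
    """
    k = 1.0
    for ut, vt in zip(u, v):
        k = np.exp(-a * np.sum((ut - vt) ** 2) - 2.0 * b * (1.0 - k))
    return k

# Gram matrix over a set of sequences, usable as a precomputed kernel:
# from sklearn.svm import SVC; SVC(kernel='precomputed').fit(G, labels)
seqs = [np.random.randn(50, 3) for _ in range(10)]
G = np.array([[recursive_rbf(s, t) for t in seqs] for s in seqs])
```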

    One step backpropagation through time for learning input mapping in reservoir computing applied to speech recognition

    Recurrent neural networks are very powerful engines for processing information that is coded in time; however, many problems with common training algorithms, such as Backpropagation Through Time, remain. Because of this, another important learning setup known as Reservoir Computing has appeared in recent years, where one uses an essentially untrained network to perform computations. Though very successful in many applications, using a random network can be quite inefficient in terms of the required number of neurons and the associated computational cost. In this paper we introduce a highly simplified version of Backpropagation Through Time by truncating the error backpropagation to one step back in time, and we combine this with the classic Reservoir Computing setup using an instantaneous linear readout. We apply this setup to a spoken digit recognition task and show that it gives very good results for small networks.
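    A sketch of the one-step truncation, assuming a tanh reservoir and a squared-error loss, and with an LMS update standing in for the paper's linear-regression readout:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 100, 13                          # reservoir neurons, input features

W = rng.normal(0, 1, (N, N))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))
V = rng.normal(0, 0.1, (N, d))          # input mapping, adapted below
w_out = np.zeros(N)                     # instantaneous linear readout

def one_step_update(x_prev, u, y_target, lr=1e-3):
    """Adapt the input mapping V with error backpropagated one step only."""
    pre = W @ x_prev + V @ u
    x = np.tanh(pre)
    err = w_out @ x - y_target
    # Backprop through the readout and the tanh, then STOP at x_prev:
    # the dependence of x_prev on V is ignored (one-step truncated BPTT).
    delta = err * w_out * (1 - x ** 2)  # dE/dpre
    V -= lr * np.outer(delta, u)
    w_out -= lr * err * x               # LMS stand-in for the linear readout
    return x
```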

    A hierarchy of recurrent networks for speech recognition

    Generative models for sequential data based on directed graphs of Restricted Boltzmann Machines (RBMs) have recently been shown to model high-dimensional sequences accurately. In these models, temporal dependencies in the input are discovered by either buffering previous visible variables or by recurrent connections of the hidden variables. Here we propose a modification of these models, the Temporal Reservoir Machine (TRM). It utilizes a recurrent artificial neural network (ANN) to integrate information from the input over time. This information is then fed into an RBM at each time step. To avoid the difficulties of recurrent network learning, the ANN remains untrained and hence can be thought of as a random feature extractor. Using the architecture of multi-layer RBMs (Deep Belief Networks), TRMs can be used as building blocks for complex hierarchical models. This approach unifies RBM-based approaches for sequential data modeling with the Echo State Network, a powerful approach for black-box system identification. The TRM is tested on a spoken digits task under noisy conditions and achieves performance competitive with previous models.
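    A structural sketch of one TRM time step, assuming (as one plausible realization, in the style of conditional RBMs) that the reservoir state conditions the RBM through dynamic biases; contrastive-divergence training of the RBM parameters is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
N, nv, nh = 100, 20, 50                 # reservoir size, visible, hidden units

# Untrained random recurrent feature extractor (echo-state style).
W_res = rng.normal(0, 1, (N, N))
W_res *= 0.9 / max(abs(np.linalg.eigvals(W_res)))
W_in = rng.normal(0, 0.1, (N, nv))

# RBM parameters; A and B map the reservoir state to dynamic biases,
# conditioning the RBM on the integrated input history.
W = rng.normal(0, 0.01, (nv, nh))
A = rng.normal(0, 0.01, (N, nv))
B = rng.normal(0, 0.01, (N, nh))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def trm_step(x, v):
    """One time step: update the reservoir, compute conditioned RBM hiddens."""
    x = np.tanh(W_res @ x + W_in @ v)   # untrained temporal integration
    b_v = A.T @ x                        # dynamic visible bias
    b_h = B.T @ x                        # dynamic hidden bias
    h = sigmoid(v @ W + b_h)             # hidden activations given v + history
    return x, h, b_v

x = np.zeros(N)
x, h, b_v = trm_step(x, rng.random(nv))  # demo call on a random frame
```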

    Multiscale approaches to music audio feature learning

    Content-based music information retrieval tasks are typically solved with a two-stage approach: features are extracted from music audio signals, and are then used as input to a regressor or classifier. These features can be engineered or learned from data. Although the former approach was dominant in the past, feature learning has started to receive more attention from the MIR community in recent years. Recent results in feature learning indicate that simple algorithms such as K-means can be very effective, sometimes surpassing more complicated approaches based on restricted Boltzmann machines, autoencoders or sparse coding. Furthermore, there has recently been increased interest in multiscale representations of music audio. Such representations are more versatile because music audio exhibits structure on multiple timescales, which are relevant for different MIR tasks to varying degrees. We develop and compare three approaches to multiscale audio feature learning using the spherical K-means algorithm. We evaluate them in an automatic tagging task and a similarity metric learning task on the Magnatagatune dataset.
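    A sketch of spherical K-means (unit-norm data and centroids, cosine-similarity assignment), together with one plausible way to make it multiscale by pooling spectrogram frames over windows of increasing length; the scales, dictionary size, and stand-in spectrogram are arbitrary:

```python
import numpy as np

def spherical_kmeans(X, k, iters=20, seed=0):
    """Spherical K-means: unit-norm data and centroids, cosine assignment."""
    rng = np.random.default_rng(seed)
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-9)
    D = X[rng.choice(len(X), k, replace=False)]   # init from the data
    for _ in range(iters):
        labels = (X @ D.T).argmax(1)              # nearest by cosine similarity
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                m = pts.sum(0)
                D[j] = m / (np.linalg.norm(m) + 1e-9)  # renormalized mean
    return D

# Multiscale use (sketch): learn one dictionary per timescale by averaging
# spectrogram frames over windows of increasing length.
spec = np.abs(np.random.randn(1000, 40))          # stand-in spectrogram
dicts = []
for scale in (1, 2, 4, 8):
    pooled = spec[:len(spec) // scale * scale].reshape(-1, scale, 40).mean(1)
    dicts.append(spherical_kmeans(pooled, k=50))
```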

    Building a patient-specific seizure detector without expert input using user triggered active learning strategies

    Purpose: Patient-specific seizure detectors outperform general seizure detectors, but building them requires a large amount of consistently annotated electroencephalogram (EEG) data from a single patient, which is expensive to gather. This work presents a method to bring general seizure detectors up to par with patient-specific seizure detectors without expert input. The user/patient is only required to push a button in case of a false alarm and/or a missed seizure. Method: For the experiments the 'CHB-MIT Scalp EEG Database' was used, which contains pre-surgically recorded EEG of 24 patients. The seizure detector used is based on (Buteneers et al. Epilepsy Research 2012:(in press)) combined with the preprocessing technique presented in (Shoeb et al. Epilepsy & Behavior 2004;5:483-498). Button presses mark the corresponding data and add it to the training set of the system. The performance is evaluated using leave-one-hour-out cross-validation to attain statistically relevant results. Results: For the patient-specific seizure detector, 34(32)% (average(standard deviation)) of the detections are false, 8(14)% of the seizures are missed, and a detection delay of 11(10)s is reached. The general seizure detector achieves 86(89)%, 28(41)% and -35(82)s, respectively. Adding only false positives, the patient-specific performance is achieved in 9 of the 24 patients. Adding missed seizures as well allows the patient-specific performance to be reached in 21 patients (about 90%). Conclusion: This work shows that, with the presented technique, a patient-specific seizure detector can be built without expert-annotated EEG data for up to 90% of the patients.
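    A sketch of the user-triggered feedback loop, with a logistic-regression classifier standing in for the paper's reservoir-based detector and hypothetical feature vectors per EEG segment:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in detector trained on multi-patient ('general') data; the paper
# uses a reservoir-computing based detector instead.
X_gen = np.random.randn(500, 16)        # hypothetical EEG segment features
y_gen = np.random.randint(0, 2, 500)    # 1 = seizure, 0 = non-seizure
clf = LogisticRegression(max_iter=1000).fit(X_gen, y_gen)

X_train, y_train = list(X_gen), list(y_gen)

def on_button_press(segment_features, kind):
    """User feedback: relabel the flagged EEG segment and retrain.

    kind == 'false_alarm'    -> the detection was wrong, label 0
    kind == 'missed_seizure' -> a seizure went undetected, label 1
    """
    X_train.append(segment_features)
    y_train.append(0 if kind == "false_alarm" else 1)
    clf.fit(np.array(X_train), np.array(y_train))  # fold feedback back in

on_button_press(np.random.randn(16), "false_alarm")  # demo call
```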